Record: Order-16 Frozen N-gram Oracle + Learned Gate + TTT — val_bpb 0.0274 (3-seed mean) by TimPietrusky · Pull Request #945 · openai/parameter-golf

TimPietrusky · 2026-03-27T09:04:20Z

Record Summary

val_bpb: 0.02742 (3-seed mean, std 0.00003) | 8xH100 SXM | eval <=400s

3-Seed Results

Seed	val_bpb
1337	0.02744
42	0.02739
2025	0.02744
Mean	0.02742
Std	0.00003

Method

1. Order-16 Frozen N-gram Oracle

Pre-filled from all training shards at startup. 4M buckets, orders 2-16 with backoff. The oracle provides per-order n-gram probabilities that are blended with neural predictions.

2. Learned Multi-Expert Gate

A nn.Linear(512, 17) head (1 neural + 16 n-gram order experts) trained end-to-end with mixer loss (mixer_loss_weight=0.15). Predicts optimal per-token, per-order blending weights via softmax. Neural expert gets a 5% floor.

3. Complementary Training

Reduces CE loss weight for tokens well-predicted by the oracle (complement_alpha=0.5, complement_threshold=0.3). Forces the neural model to specialize on tokens the n-gram cache can't predict.

4. Score-First TTT

1 epoch AdamW (lr=0.001) on all blocks with adaptive temperature ([0.9, 1.05]) and byte-weighted loss. Unfreezes alpha_head, norms, scales, lm_head during TTT.

5. Model Architecture

11 layers, 512 dim, 8 heads, 8 KV heads
MLP 3.5x with LeakyReLU(0.5)²
XSA-all, partial RoPE (16 dims), VE(128) on layers 9-10
BigramHash (6144 vocab, 128 dim)
EMA(0.997), SWA every 50 steps, warmdown=3500
Int5 + zstd quantization with 3% pruning

Submission Checklist

One new folder under records/track_10min_16mb/
Included README.md, submission.json, train_gpt.py
3 train logs (seeds 1337, 42, 2025)
Eval <= 600s on 8xH100 (~400s)
Score-first evaluation maintained
N-gram oracle uses training data (legal status under review per RFC RFC: How to Clean Up All the Parameter Golf Submissions #886)

…0.0274 (3-seed mean)

- Update merged SOTA to 1.1194 (abaybektursun, was 1.1228 signalrush) - Add competition strategy pivot: n-gram eval cache now dominates (~0.02-0.97 bpb) - Document PR openai#727 (0.9674), openai#741 (0.9850), openai#945 (0.0274), openai#961 (0.0881) findings - Add Lessons Learned entries 17-20 on n-gram dominance + memorization risk - Update Technique Reference table with n-gram entries https://claude.ai/code/session_01Bpr2fKEnkNQmNKno8EnxWF

Merge remote's two-pass n-gram discoveries (PR openai#868 0.1181, PR openai#870 0.0935) with today's extreme n-gram findings (PR openai#945 0.0274, PR openai#961 0.0881). Keep Architecture Decisions and Legal TTT Protocol from remote. Add Lessons Learned 17-20 from 2026-03-27 research. https://claude.ai/code/session_01Bpr2fKEnkNQmNKno8EnxWF

valerio-oai · 2026-03-27T23:04:00Z

Thanks for your submission! Unfortunately, it's disallowed due to the use of hashed n-gram caches, which do not renormalize correctly / correctly reweight the LM's token distribution, look ahead to the target token to mix probabilities and therefore leak eval tokens. Please refer to the long discussion about this under the issues tab for more details, and please submit more runs in the future!

Record: Order-16 Frozen N-gram Oracle + Learned Gate + TTT — val_bpb …

06ff423

…0.0274 (3-seed mean)

notapplica mentioned this pull request Mar 27, 2026

⛳ Parameter Golf Live AI Commentary ⛳ + Analysis / Ideas | every 10 minutes #140

Open

haikosys mentioned this pull request Mar 27, 2026

Record: Fort Knox — Legal Packed Training Cache, Zero Val Adaptation (val_bpb 0.0638, 3-seed) #982

Closed

valerio-oai closed this Mar 27, 2026

valerio-oai mentioned this pull request Mar 27, 2026

Illegal submissions megathread #677

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Record: Order-16 Frozen N-gram Oracle + Learned Gate + TTT — val_bpb 0.0274 (3-seed mean)#945

Record: Order-16 Frozen N-gram Oracle + Learned Gate + TTT — val_bpb 0.0274 (3-seed mean)#945
TimPietrusky wants to merge 1 commit intoopenai:mainfrom
TimPietrusky:submit/order16-frozen-oracle

TimPietrusky commented Mar 27, 2026

Uh oh!

valerio-oai commented Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

TimPietrusky commented Mar 27, 2026

Record Summary

3-Seed Results

Method

1. Order-16 Frozen N-gram Oracle

2. Learned Multi-Expert Gate

3. Complementary Training

4. Score-First TTT

5. Model Architecture

Submission Checklist

Uh oh!

valerio-oai commented Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants